Coefficient of determination
Definition
$R^2$ is one minus the residual sum of squares (the unexplained variance) divided by the total sum of squares. See the figure in https://en.wikipedia.org/wiki/Coefficient_of_determination#Definitions
Let $\hat{y}_i$ be the value of $y_i$ predicted by the given model. As the model becomes better, the residual sum of squares

$$SS_{\text{res}} = \sum_i (y_i - \hat{y}_i)^2$$

should decrease. But how small is small?

One baseline "model" that we can assume is to always predict $\hat{y}_i = \bar{y}$, where $\bar{y} = \frac{1}{n}\sum_i y_i$ (the mean). The sum of squares for this "null" model, $SS_{\text{tot}} = \sum_i (y_i - \bar{y})^2$, gives us a baseline. The reduction of the residual sum of squares (RSS) achieved by the regression can be written as $SS_{\text{tot}} - SS_{\text{res}}$. We can then normalize this by the total variation in the data. $R^2$ is now simply:

$$R^2 = \frac{SS_{\text{tot}} - SS_{\text{res}}}{SS_{\text{tot}}} = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}$$
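As a quick numerical sketch of the definition, we can compute $R^2$ by hand and check it against `sklearn.metrics.r2_score` (the toy `y` and `y_hat` arrays below are made up for illustration):

```python
import numpy as np
from sklearn.metrics import r2_score

# Toy observations and model predictions (illustrative values only)
y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.8, 5.1, 7.2, 8.9])

ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)  # total sum of squares (null model)
r2_manual = 1 - ss_res / ss_tot

print(r2_manual)                         # 0.995
print(np.isclose(r2_manual, r2_score(y, y_hat)))  # True
```

Here the null model (always predicting the mean, 6.0) leaves a total sum of squares of 20, of which the model fails to explain only 0.10.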
Adjusted $R^2$
Adding more explanatory variables spuriously increases the $R^2$ value, so the adjusted $R^2$ is often used when there are multiple explanatory variables.
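The usual adjustment penalizes the number of explanatory variables $p$: $\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-p-1}$. A minimal sketch (the `r2`, `n`, and `p` values below are made-up illustrations):

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n samples and p explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The same plain R^2 = 0.90 looks much less impressive
# once we account for how many variables were used:
print(adjusted_r2(0.90, n=100, p=2))   # ≈ 0.8979
print(adjusted_r2(0.90, n=100, p=50))  # ≈ 0.7980
```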
Issues
- Why I'm Not a Fan of R-Squared by John Myles White
- Cosma Shalizi's "rant" about $R^2$: http://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/10/lecture-10.pdf
The issue with models without an intercept
Supposedly, R uses the baseline model $\hat{y}_i = 0$ instead of $\hat{y}_i = \bar{y}$ when the model is fit without an intercept, so the total sum of squares becomes $\sum_i y_i^2$.
```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Generate synthetic data
np.random.seed(0)
n = 100
X = np.linspace(0, 10, n).reshape(-1, 1)
y = 3 + 2 * X.flatten() + np.random.normal(0, 1, n)

# Fit linear model with intercept
model_with_intercept = LinearRegression(fit_intercept=True)
model_with_intercept.fit(X, y)
r2_with_intercept = model_with_intercept.score(X, y)

# Fit linear model without intercept
model_without_intercept = LinearRegression(fit_intercept=False)
model_without_intercept.fit(X, y)
y_pred_without_intercept = model_without_intercept.predict(X)
# Note: sklearn's score() always uses the mean baseline,
# even when fit_intercept=False
r2_without_intercept = model_without_intercept.score(X, y)

# SST relative to the mean (with intercept) vs. relative to zero (without)
SST_with_intercept = np.sum((y - np.mean(y))**2)
SST_without_intercept = np.sum(y**2)

# R^2 computed manually with the zero baseline
SSE_without_intercept = np.sum((y - y_pred_without_intercept)**2)
r2_without_intercept_manual = 1 - SSE_without_intercept / SST_without_intercept

print(r2_with_intercept, r2_without_intercept, r2_without_intercept_manual)
print(SST_with_intercept, SST_without_intercept)
```
The same comparison in R:

```r
# Generate synthetic data
set.seed(0)
n <- 100
X <- seq(0, 10, length.out = n)
y <- 3 + 2 * X + rnorm(n, 0, 1)

# Fit linear model with intercept
model_with_intercept <- lm(y ~ X)
summary(model_with_intercept)

# Compute SST relative to the mean
SST_with_intercept <- sum((y - mean(y))^2)

# Fit linear model without intercept
model_without_intercept <- lm(y ~ X - 1)
summary(model_without_intercept)  # reports R^2 against the zero baseline

# Compute SST relative to zero
SST_without_intercept <- sum(y^2)

# Compute R^2 manually with the zero baseline;
# this matches the R^2 reported by summary() above
SSE_without_intercept <- sum(residuals(model_without_intercept)^2)
R2_without_intercept_manual <- 1 - SSE_without_intercept / SST_without_intercept

cat("SST with intercept:", SST_with_intercept, "\n")
cat("SST without intercept:", SST_without_intercept, "\n")
cat("R2 without intercept (manual):", R2_without_intercept_manual, "\n")
```